Neural Gas Clustering for Dissimilarity Data with Continuous Prototypes
Authors
Abstract
Prototype-based neural clustering and data mining methods such as the self-organizing map or neural gas are intuitive and powerful machine learning tools for a variety of application areas. However, the classical methods are restricted to data embedded in a real vector space and have only limited applicability to non-Euclidean data as it occurs in, for example, biomedical or symbolic domains. Recently, extensions of unsupervised neural prototype-based clustering to dissimilarity data, i.e. data characterized only in terms of a dissimilarity matrix, have been proposed that substitute the mean by the so-called generalized median. In these extensions, prototype locations are chosen within the discrete input space, which is a severe limitation, in particular for sparse data sets, because it restricts prototype flexibility. Here we present a generalization of median neural gas in which prototypes can be interpreted as mixtures of discrete input locations. We derive a batch optimization scheme based on a corresponding cost function.
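The idea of prototypes as mixtures of input locations can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the dissimilarities behave like squared Euclidean distances, represents each prototype as w_i = sum_j alpha_ij x_j with coefficients summing to 1, and uses the identity d(x_j, w_i) = (D alpha_i)_j - 0.5 * alpha_i^T D alpha_i so that only the matrix D is needed. The function names, the annealing schedule, and the default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def prototype_distances(D, alpha):
    """Distances between all data points and all mixture prototypes.

    D     : (n, n) symmetric dissimilarity matrix (assumed squared-Euclidean-like).
    alpha : (k, n) mixture coefficients, each row summing to 1.
    Returns a (k, n) array with entry [i, j] = d(x_j, w_i).
    """
    cross = alpha @ D                                      # (D alpha_i)_j, D symmetric
    self_term = 0.5 * np.einsum("ij,ij->i", cross, alpha)  # 0.5 * alpha_i^T D alpha_i
    return cross - self_term[:, None]


def relational_neural_gas(D, n_prototypes, n_epochs=50, seed=0):
    """Batch neural gas on a dissimilarity matrix (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    # Random convex combinations of the data points as initial prototypes.
    alpha = rng.random((n_prototypes, n))
    alpha /= alpha.sum(axis=1, keepdims=True)

    lam_start, lam_end = n_prototypes / 2.0, 0.01  # assumed annealing range
    for t in range(n_epochs):
        # Exponentially shrink the neighborhood range from lam_start to lam_end.
        lam = lam_start * (lam_end / lam_start) ** (t / max(n_epochs - 1, 1))
        dist = prototype_distances(D, alpha)
        # Rank of every prototype for every data point (0 = closest).
        ranks = dist.argsort(axis=0).argsort(axis=0)
        h = np.exp(-ranks / lam)
        # Batch update: coefficients are the normalized neighborhood weights.
        alpha = h / h.sum(axis=1, keepdims=True)

    labels = prototype_distances(D, alpha).argmin(axis=0)
    return alpha, labels
```

Each row of `alpha` can be read as the weights with which the data points contribute to one prototype; restricting every row to a single nonzero entry would recover a median-style prototype located at one data point.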
Similar resources
Magnification Control in Relational Neural Gas
Prototype-based clustering algorithms such as the Self-Organizing Map (SOM) or Neural Gas (NG) offer powerful tools for automated data inspection. The distribution of prototypes, however, does not coincide with the underlying data distribution, and magnification control is necessary to obtain information-theoretically optimal maps. Recently, several extensions of SOM and NG to general non-vectorial ...
Clustering Algorithm for Incomplete Data Sets with Mixed Numeric and Categorical Attributes
The traditional k-prototypes algorithm is well suited to clustering data with mixed numeric and categorical attributes, but it is limited to complete data. To handle incomplete data sets with missing values, an improved k-prototypes algorithm is proposed in this paper, which employs a new dissimilarity measure for incomplete data sets with mixed numeric and categorical attributes and a...
A supervised growing neural gas algorithm for cluster analysis
In this paper, a prototype-based supervised clustering algorithm is proposed. The proposed algorithm, called the Supervised Growing Neural Gas algorithm (SGNG), incorporates several techniques from some unsupervised GNG algorithms such as the adaptive learning rates and the cluster repulsion mechanisms of the Robust Growing Neural Gas algorithm, and the Type Two Learning Vector Quantization (LV...
Topographic mapping of dissimilarity datasets
A great challenge today, arising in many fields of science, is the proper mapping of datasets to explore their structure and gain information that would otherwise remain concealed due to high dimensionality. This task is impossible without appropriate tools that help experts understand the data. A promising way to support the experts in their work is the topographic mapping of the data...
Topographic Mapping of Large Dissimilarity Data Sets
Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectorial data sets; they require a classical feature encoding of information. Often ...
Publication date: 2007